low-rank adaptation
PennyCoder: Efficient Domain-Specific LLMs for PennyLane-Based Quantum Code Generation
Basit, Abdul, Shao, Minghao, Asif, Muhammad Haider, Innan, Nouhaila, Kashif, Muhammad, Marchisio, Alberto, Shafique, Muhammad
--The growing demand for robust quantum programming frameworks has unveiled a critical limitation: current large language model (LLM) based quantum code assistants heavily rely on remote APIs, introducing challenges related to privacy, latency, and excessive usage costs. Addressing this gap, we propose PennyCoder, a novel lightweight framework for quantum code generation, explicitly designed for local and embedded deployment to enable on-device quantum programming assistance without external API dependence. PennyCoder leverages a fine-tuned version of the LLaMA 3.1-8B model, adapted through parameter-efficient Low-Rank Adaptation (LoRA) techniques combined with domain-specific instruction tuning optimized for the specialized syntax and computational logic of quantum programming in PennyLane, including tasks in quantum machine learning and quantum reinforcement learning. Unlike prior work focused on cloud-based quantum code generation, our approach emphasizes device-native operability while maintaining high model efficacy. We rigorously evaluated PennyCoder over a comprehensive quantum programming dataset, achieving 44.3% accuracy with our fine-tuned model (compared to 33.7% for the base LLaMA 3.1-8B and 40.1% for the RAG-augmented baseline), demonstrating a significant improvement in functional correctness. Quantum computing is rapidly evolving from a theoretical pursuit to a practical technology, propelled by advances in both hardware and software.
TLoRA: Tri-Matrix Low-Rank Adaptation of Large Language Models
We propose TLoRA, a novel tri-matrix low-rank adaptation method that decomposes weight updates into three matrices: two fixed random matrices and one trainable matrix, combined with a learnable, layer-wise scaling factor. This tri-matrix design enables TLoRA to achieve highly efficient parameter adaptation while introducing minimal additional computational overhead. Through extensive experiments on the GLUE benchmark, we demonstrate that TLoRA achieves comparable performance to existing low-rank methods such as LoRA and adapter-based techniques, while requiring significantly fewer trainable parameters. Analyzing the adaptation dynamics, we observe that TLoRA exhibits Gaussian-like weight distributions, stable parameter norms, and scaling factor variability across layers, further highlighting its expressive power and adaptability. Additionally, we show that TLoRA closely resembles LoRA in its eigenvalue distributions, parameter norms, and cosine similarity of updates, underscoring its ability to effectively approximate LoRA's adaptation behavior. Our results establish TLoRA as a highly efficient and effective fine-tuning method for LLMs, offering a significant step forward in resource-efficient model adaptation.
- Research Report > New Finding (0.66)
- Research Report > Promising Solution (0.46)
Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning
Lee, Na, Barmpas, Konstantinos, Panagakis, Yannis, Adamos, Dimitrios, Laskaris, Nikolaos, Zafeiriou, Stefanos
Foundation Models have demonstrated significant success across various domains in Artificial Intelligence (AI), yet their capabilities for brainwave modeling remain unclear. In this paper, we comprehensively evaluate current Large Brainwave Foundation Models (LBMs) through systematic fine-tuning experiments across multiple Brain-Computer Interface (BCI) benchmark tasks, including memory tasks and sleep stage classification. Our extensive analysis shows that state-of-the-art LBMs achieve only marginal improvements (0.9%-1.2%) over traditional deep architectures while requiring significantly more parameters (millions vs thousands), raising important questions about their efficiency and applicability in BCI contexts. Moreover, through detailed ablation studies and Low-Rank Adaptation (LoRA), we significantly reduce trainable parameters without performance degradation, while demonstrating that architectural and training inefficiencies limit LBMs' current capabilities. Our experiments span both full model fine-tuning and parameter-efficient adaptation techniques, providing insights into optimal training strategies for BCI applications. We pioneer the application of LoRA to LBMs, revealing that performance benefits generally emerge when adapting multiple neural network components simultaneously. These findings highlight the critical need for domain-specific development strategies to advance LBMs, suggesting that current architectures may require redesign to fully leverage the potential of foundation models in brainwave analysis.
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.46)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.93)
Lite VLA: Efficient Vision-Language-Action Control on CPU-Bound Edge Robots
Williams, Justin, Gupta, Kishor Datta, George, Roy, Sarkar, Mrinmoy
The deployment of artificial intelligence models at the edge is increasingly critical for autonomous robots operating in GPS-denied environments where local, resource-efficient reasoning is essential. This work demonstrates the feasibility of deploying small Vision-Language Models (VLMs) on mobile robots to achieve real-time scene understanding and reasoning under strict computational constraints. Unlike prior approaches that separate perception from mobility, the proposed framework enables simultaneous movement and reasoning in dynamic environments using only on-board hardware. The system integrates a compact VLM with multimodal perception to perform contextual interpretation directly on embedded hardware, eliminating reliance on cloud connectivity. Experimental validation highlights the balance between computational efficiency, task accuracy, and system responsiveness. Implementation on a mobile robot confirms one of the first successful deployments of small VLMs for concurrent reasoning and mobility at the edge. This work establishes a foundation for scalable, assured autonomy in applications such as service robotics, disaster response, and defense operations.
Fine-Tuning Open Video Generators for Cinematic Scene Synthesis: A Small-Data Pipeline with LoRA and Wan2.1 I2V
Akarsu, Meftun, Catay, Kerem, Vedat, Sedat Bin, Yarkan, Enes Kutay, Senturk, Ilke, Sar, Arda, Eksioglu, Dafne
We present a practical pipeline for fine-tuning open-source video diffusion transformers to synthesize cinematic scenes for television and film production from small datasets. The proposed two-stage process decouples visual style learning from motion generation. In the first stage, Low-Rank Adaptation (LoRA) modules are integrated into the cross-attention layers of the Wan2.1 I2V-14B model to adapt its visual representations using a compact dataset of short clips from Ay Yapim's historical television film El Turco. This enables efficient domain transfer within hours on a single GPU. In the second stage, the fine-tuned model produces stylistically consistent keyframes that preserve costume, lighting, and color grading, which are then temporally expanded into coherent 720p sequences through the model's video decoder. We further apply lightweight parallelization and sequence partitioning strategies to accelerate inference without quality degradation. Quantitative and qualitative evaluations using FVD, CLIP-SIM, and LPIPS metrics, supported by a small expert user study, demonstrate measurable improvements in cinematic fidelity and temporal stability over the base model. The complete training and inference pipeline is released to support reproducibility and adaptation across cinematic domains.
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.06)
- Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.06)
- Asia > Singapore (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
C-LoRA: Contextual Low-Rank Adaptation for Uncertainty Estimation in Large Language Models
Rahmati, Amir Hossein, Jantre, Sanket, Zhang, Weifeng, Wang, Yucheng, Yoon, Byung-Jun, Urban, Nathan M., Qian, Xiaoning
Low-Rank Adaptation (LoRA) offers a cost-effective solution for fine-tuning large language models (LLMs), but it often produces overconfident predictions in data-scarce few-shot settings. To address this issue, several classical statistical learning approaches have been repurposed for scalable uncertainty-aware LoRA fine-tuning. However, these approaches neglect how input characteristics affect the predictive uncertainty estimates. To address this limitation, we propose Contextual Low-Rank Adaptation (C-LoRA) as a novel uncertainty-aware and parameter efficient fine-tuning approach, by developing new lightweight LoRA modules contextualized to each input data sample to dynamically adapt uncertainty estimates. Incorporating data-driven contexts into the parameter posteriors, C-LoRA mitigates overfitting, achieves well-calibrated uncertainties, and yields robust predictions. Extensive experiments on LLaMA2-7B models demonstrate that C-LoRA consistently outperforms the state-of-the-art uncertainty-aware LoRA methods in both uncertainty quantification and model generalization. Ablation studies further confirm the critical role of our contextual modules in capturing sample-specific uncertainties. C-LoRA sets a new standard for robust, uncertainty-aware LLM fine-tuning in few-shot regimes. Although our experiments are limited to 7B models, our method is architecture-agnostic and, in principle, applies beyond this scale; studying its scaling to larger models remains an open problem. Our code is available at https://github.com/ahra99/c_lora.
- Europe (0.92)
- North America > United States > Texas (0.28)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation
Li, Shiwei, Luo, Xiandi, Wang, Haozhao, Tang, Xing, Cui, Ziqiang, Liu, Dugang, Li, Yuhua, He, Xiuqiang, Li, Ruixuan
Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). LoRA essentially describes the projection of an input space into a low-dimensional output space, with the dimensionality determined by the LoRA rank. In standard LoRA, all input tokens share the same weights and undergo an identical input-output projection. This limits LoRA's ability to capture token-specific information due to the inherent semantic differences among tokens. To address this limitation, we propose Token-wise Projected Low-Rank Adaptation (TopLoRA), which dynamically adjusts LoRA weights according to the input token, thereby learning token-wise input-output projections in an end-to-end manner. Formally, the weights of TopLoRA can be expressed as $BΣ_X A$, where $A$ and $B$ are low-rank matrices (as in standard LoRA), and $Σ_X$ is a diagonal matrix generated from each input token $X$. Notably, TopLoRA does not increase the rank of LoRA weights but achieves more granular adaptation by learning token-wise LoRA weights (i.e., token-wise input-output projections). Extensive experiments across multiple models and datasets demonstrate that TopLoRA consistently outperforms LoRA and its variants. The code is available at https://github.com/Leopold1423/toplora-neurips25.
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > China > Hubei Province (0.04)
- Asia > China > Hong Kong (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
Text to Trust: Evaluating Fine-Tuning and LoRA Trade-offs in Language Models for Unfair Terms of Service Detection
Juttu, Noshitha Padma Pratyusha, Singireddy, Sahithi, Gona, Sravani, Timilsina, Sujal
T erms of Service (T oS) agreements often contain clauses that are difficult to interpret and potentially unfair to users. Manual identification of such clauses is infeasible at scale, motivating the need for automated, accurate, and efficient detection methods. This study presents a comprehensive evaluation of clause-level unfairness detection using a diverse range of large language model (LLM) strategies, including full fine-tuning, parameter-efficient tuning, and zero-shot prompting. Experiments are conducted with full fine-tuning on BERT and DistilBERT, 4-bit quantized Low-Rank Adaptation (LoRA) applied to models such as TinyLlama and LLaMA, and to the legal domain-specific SaulLM, and evaluate zero-shot prompting using high-performing API-accessible models like GPT-4o and O3-mini. Evaluations are performed on the Claudette-T oS dataset from Hugging Face and further validated on the Multilingual Scraper of Privacy Policies and T erms of Service corpus, which comprises large-scale T oS documents collected from the web. Full fine-tuning delivers the strongest overall performance, parameter-efficient models offer a favorable accuracy-efficiency trade-off, and zero-shot prompting enables fast deployment with high recall. These results offer practical insights into building scalable and cost-effective unfairness detection systems for legal-tech applications.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Asia > Laos (0.04)
Low-Rank Adaptation of Neural Fields
Truong, Anh, Mahmoud, Ahmed H., Luković, Mina Konaković, Solomon, Justin
Processing visual data often involves small adjustments or sequences of changes, e.g., image filtering, surface smoothing, and animation. While established graphics techniques like normal mapping and video compression exploit redundancy to encode such small changes efficiently, the problem of encoding small changes to neural fields -- neural network parameterizations of visual or physical functions -- has received less attention. We propose a parameter-efficient strategy for updating neural fields using low-rank adaptations (LoRA). LoRA, a method from the parameter-efficient fine-tuning LLM community, encodes small updates to pre-trained models with minimal computational overhead. We adapt LoRA for instance-specific neural fields, avoiding the need for large pre-trained models and yielding lightweight updates. We validate our approach with experiments in image filtering, geometry editing, video compression, and energy-based editing, demonstrating its effectiveness and versatility for representing neural field updates.
- Europe (0.67)
- North America > United States > California (0.28)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
AILoRA: Function-Aware Asymmetric Initialization for Low-Rank Adaptation of Large Language Models
Ji, Xiaoshuang, Zhao, Zhendong, Gu, Xiaoyan, Chen, Xiaojun, Zhao, Xin, Liu, Zeyao
Parameter-efficient finetuning (PEFT) aims to mitigate the substantial computational and memory overhead involved in adapting large-scale pretrained models to diverse downstream tasks. Among numerous PEFT strategies, Low-Rank Adaptation (LoRA) has emerged as one of the most widely adopted approaches due to its robust empirical performance and low implementation complexity. In practical deployment, LoRA is typically applied to the $W^Q$ and $W^V$ projection matrices of self-attention modules, enabling an effective trade-off between model performance and parameter efficiency. While LoRA has achieved considerable empirical success, it still encounters challenges such as suboptimal performance and slow convergence. To address these limitations, we introduce \textbf{AILoRA}, a novel parameter-efficient method that incorporates function-aware asymmetric low-rank priors. Our empirical analysis reveals that the projection matrices $W^Q$ and $W^V$ in the self-attention mechanism exhibit distinct parameter characteristics, stemming from their functional differences. Specifically, $W^Q$ captures task-specific semantic space knowledge essential for attention distributions computation, making its parameters highly sensitive to downstream task variations. In contrast, $W^V$ encodes token-level feature representations that tend to remain stable across tasks and layers. Leveraging these insights, AILoRA performs a function-aware initialization by injecting the principal components of $W^Q$ to retain task-adaptive capacity, and the minor components of $W^V$ to preserve generalizable feature representations. This asymmetric initialization strategy enables LoRA modules to better capture the specialized roles of attention parameters, thereby enhancing both finetuning performance and convergence efficiency.
- Asia > China > Beijing > Beijing (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > Middle East > Jordan (0.04)